\documentclass{bioinfo}
\copyrightyear{2005}
\pubyear{2005}
\usepackage{url}
\usepackage[normalem]{ulem}
\usepackage{multirow}
\renewcommand{\cite}{\citep}
%DIF PREAMBLE EXTENSION ADDED BY LATEXDIFF
%DIF UNDERLINE PREAMBLE %DIF PREAMBLE
\RequirePackage[normalem]{ulem} %DIF PREAMBLE
\RequirePackage{color}\definecolor{RED}{rgb}{1,0,0}\definecolor{BLUE}{rgb}{0,0,1} %DIF PREAMBLE
\providecommand{\DIFadd}[1]{{\protect\color{blue}\uwave{#1}}} %DIF PREAMBLE
\providecommand{\DIFdel}[1]{{\protect\color{red}\sout{#1}}}                      %DIF PREAMBLE
%DIF SAFE PREAMBLE %DIF PREAMBLE
\providecommand{\DIFaddbegin}{} %DIF PREAMBLE
\providecommand{\DIFaddend}{} %DIF PREAMBLE
\providecommand{\DIFdelbegin}{} %DIF PREAMBLE
\providecommand{\DIFdelend}{} %DIF PREAMBLE
%DIF FLOATSAFE PREAMBLE %DIF PREAMBLE
\providecommand{\DIFaddFL}[1]{\DIFadd{#1}} %DIF PREAMBLE
\providecommand{\DIFdelFL}[1]{\DIFdel{#1}} %DIF PREAMBLE
\providecommand{\DIFaddbeginFL}{} %DIF PREAMBLE
\providecommand{\DIFaddendFL}{} %DIF PREAMBLE
\providecommand{\DIFdelbeginFL}{} %DIF PREAMBLE
\providecommand{\DIFdelendFL}{} %DIF PREAMBLE
%DIF END PREAMBLE EXTENSION ADDED BY LATEXDIFF

\begin{document}
\firstpage{1}

\title[Physiology phenotypes]{Semantic integration of physiology
  phenotypes with an application to the Cellular Phenotype Ontology}

\author[Hoehndorf \textit{et~al}]{Robert Hoehndorf$^{1}$\footnote{to
    whom correspondence should be addressed}, Midori
  A. Harris$^2$, Heinrich Herre$^3$, Gabriella Rustici$^4$ and
  Georgios V. Gkoutos$^{1}$}

\address{$^{1}$Department of Genetics, University of Cambridge,
  Downing Street, Cambridge CB2 3EH, UK\\
  $^{2}$Department of Biochemistry, University of Cambridge, 80 Tennis
  Court Road, Cambridge CB2 1GA, UK\\
  $^{3}$Institute for Medical Informatics, Statistics and
  Epidemiology, University of Leipzig, Haertelstrasse 16-18, 04107
  Leipzig, Germany\\
  $^{4}$European Bioinformatics Institute, Wellcome Trust Genome
  Campus, Hinxton, Cambridge CB10 1SD, UK}

\history{Received on XXXXX; revised on XXXXX; accepted on XXXXX}

\editor{Associate Editor: XXXXXXX}

\maketitle

\begin{abstract}
\section{Motivation:}
The systematic observation of phenotypes has become a crucial tool of
functional genomics, and several large international projects are
currently underway to identify and characterize the phenotypes that
are associated with genotypes in several species. To integrate
phenotype descriptions within and across species, phenotype ontologies
have been developed. Applying ontologies to unify phenotype
descriptions in the domain of physiology has been a particular
challenge due to the high complexity of the underlying domain.

\section{Results:}
Here, we present the outline of a theory and its implementation for an
ontology of physiology-related phenotypes. We provide a formal
description of process attributes and relate them to the attributes of
their temporal parts and participants. We apply our theory to create
the Cellular Phenotype Ontology (CPO). The CPO is an ontology of
morphological and physiological phenotypic characteristics of cells,
cell components and cellular processes. Its prime application is \DIFdelbegin \DIFdel{the
unification of cellular phenotype descriptions across species by
providing }\DIFdelend \DIFaddbegin \DIFadd{to
provide }\DIFaddend terms and uniform definition patterns \DIFaddbegin \DIFadd{for the annotation of
cellular phenotypes}\DIFaddend . The CPO can \DIFdelbegin \DIFdel{further
}\DIFdelend be used for the annotation of
observed abnormalities in domains, such as systems microscopy, in
which cellular abnormalities are observed and for which no phenotype
ontology has been created.

\section{Availability and implementation:} 
The CPO and the source code we generated to create the CPO are freely
available on \url{http://cell-phenotype.googlecode.com}.

\section{Contact:} \href{mailto:rh497@cam.ac.uk}{rh497@cam.ac.uk}
\end{abstract}

\section{Introduction}
Phenotype studies on all scales and levels of granularity are now an
invaluable tool for functional genomics research. Phenotypes of
targeted mutations in animal models are now systematically recorded to
reveal the role of individual genes within a biological system. These
phenotype studies now play a key role in translational research and
are being used to reveal candidate genes for orphan diseases and to
identify chemicals that may have effects on these diseases
\cite{Schofield2011}.

The large volume and diversity of phenotypes within different species
and across multiple scales and levels of granularity necessitate the
application of flexible strategies for managing and integrating data
so that it becomes amenable to automated comparative analyses. To
integrate biomedical data across heterogeneous information systems,
biomedical ontologies are being developed \DIFdelbegin %DIFDELCMD < \cite{Smith2007}%%%
\DIFdelend \DIFaddbegin \cite{Smith2007short}\DIFaddend . An
ontology is an explicit specification of a conceptualization of a
domain and can be used to make the meaning of terms in a vocabulary
explicit \cite{Gruber1995, Guarino1998}. Ontologies play a crucial role in
the annotation of biomedical data and the integration of model
organism databases \cite{go2010, Bada2004, goble}.

Ontologies increasingly rely on the use of Semantic Web technologies
\cite{Berners-Lee2001}. The Semantic Web provides a stack of protocols
and languages to include explicit semantics in websites. In
particular, the Web Ontology Language (OWL) \cite{Grau2008} has been
designed to express and share ontologies within the Semantic Web. OWL
is a language based on description logics (a group of formal languages
based on first-order predicate logic). Automated reasoners have been
developed within the Semantic Web to perform complex operations on
ontologies formulated in OWL. In particular, automated reasoners can
verify an ontology's consistency and use deductive inference to
perform powerful queries over ontologies. To benefit from automated
reasoning and the rapidly increasing number of software tools that are
being developed within the Semantic Web, most biomedical ontologies
are now available in OWL or can be converted into an OWL-based
representation \cite{Horrocks2007, Hoehndorf2010patterns}.

In the domain of phenotypes, multiple ontologies have been
developed. In particular, ontologies to characterize mammalian
\cite{Smith2004}, human \cite{Robinson2008}, yeast \DIFdelbegin %DIFDELCMD < \cite{ypo} %%%
\DIFdelend \DIFaddbegin \cite{aposhort} \DIFaddend and
worm \cite{wpo} phenotypes are now available, while several more
phenotype ontologies are under development. To benefit from automated
reasoning, integrate phenotypes across species and reuse the content
of anatomy and process ontologies, we have defined process classes
using the framework of the Phenotypic Attribute and Trait (PATO)
ontology \cite{Gkoutos2005}. According to the PATO framework, a
phenotype can be decomposed, using an Entity-Quality model, into an
affected entity and a quality that characterizes {\em how} the entity
is affected \cite{Gkoutos2005}. Such decompositions have been created
for several widely used phenotype ontologies \cite{Mungall2010,
  Gkoutos2009b, obml2011h1}, and are being applied together with
methods for reusing knowledge contained in anatomy ontologies
\cite{Mungall2010, Hoehndorf2010phene}.

While the PATO framework is now successfully being applied to
semantically integrate phenotypes across species, the diversity and
complexity of phenotypes in which biological processes and functions
are impaired continues to limit the interoperability between phenotype
ontologies. Major challenges for representing and integrating process
phenotypes include establishing the link to the components of
biological systems that have the capability to exhibit such
behaviour, and accounting for the fact that attributes of processes
are often measured {\em indirectly} and inferred from other attributes.

\DIFdelbegin \DIFdel{To illustrate these challenges, consider physiological processes of
the heart. One of the heart's functions is }%DIFDELCMD < {\em %%%
\DIFdel{Heart beating}%DIFDELCMD < }%%%
\DIFdel{, i.e.,
a capability that is realized through processes of the type }%DIFDELCMD < {\em %%%
\DIFdel{Heart
  beating}%DIFDELCMD < }%%%
\DIFdel{. }%DIFDELCMD < {\em %%%
\DIFdel{Blood}%DIFDELCMD < } %%%
\DIFdel{is a participant of }%DIFDELCMD < {\em %%%
\DIFdel{Heart beating}%DIFDELCMD < }
%DIFDELCMD < %%%
\DIFdel{processes.  An abnormal phenotype of an organism could be that the
}%DIFDELCMD < {\em %%%
\DIFdel{rate of heart beating}%DIFDELCMD < } %%%
\DIFdel{is increased. The intended meaning of such
a description is that the number of }%DIFDELCMD < {\em %%%
\DIFdel{Heart beating}%DIFDELCMD < } %%%
\DIFdel{processes in a
given time interval is higher than normal. Another important attribute
of heart physiology is the fluid flow rate through the heart. For an
abnormal phenotype such as }%DIFDELCMD < {\em %%%
\DIFdel{increased rate of fluid flow through
  the heart}%DIFDELCMD < }%%%
\DIFdel{, the intended meaning could either be that the amount of
fluid that is moved through the heart within a single heart beating
process is increased or that the amount of fluid that is moved through
the heart within a period of time interval is increased. 
}\DIFdelend %DIF >  To illustrate these challenges, consider physiological processes of
%DIF >  the heart. One of the heart's functions is {\em Heart beating}, i.e.,
%DIF >  a capability that is realized through processes of the type {\em Heart
%DIF >    beating}. {\em Blood} is a participant of {\em Heart beating}
%DIF >  processes.  An abnormal phenotype of an organism could be that the
%DIF >  {\em rate of heart beating} is increased. The intended meaning of such
%DIF >  a description is that the number of {\em Heart beating} processes in a
%DIF >  given time interval is higher than normal. Another important attribute
%DIF >  of heart physiology is the fluid flow rate through the heart. For an
%DIF >  abnormal phenotype such as {\em increased rate of fluid flow through
%DIF >    the heart}, the intended meaning could either be that the amount of
%DIF >  fluid that is moved through the heart within a single heart beating
%DIF >  process is increased or that the amount of fluid that is moved through
%DIF >  the heart within a period of time is increased. 

\DIFdelbegin \DIFdel{Based on these examples, we can make several observations about
process attributes. First, for a process like }%DIFDELCMD < {\em %%%
\DIFdel{Heart beating}%DIFDELCMD < }%%%
\DIFdel{, we
can distinguish between single occurrences and processes in which }%DIFDELCMD < {\em
%DIFDELCMD <   %%%
\DIFdel{Heart beating}%DIFDELCMD < } %%%
\DIFdel{occurs multiple times. Only the latter kind of
process may have a }%DIFDELCMD < {\em %%%
\DIFdel{rate of heart beating}%DIFDELCMD < }%%%
\DIFdel{, while }%DIFDELCMD < {\em %%%
\DIFdel{Heart
  beating}%DIFDELCMD < } %%%
\DIFdel{processes do not have such an attribute. Second, we can
distinguish between abnormal fluid flow rates in }%DIFDELCMD < {\em %%%
\DIFdel{Heart beating}%DIFDELCMD < }
%DIFDELCMD < %%%
\DIFdel{processes and rate of fluid flow through the heart within a given
duration. Both may have entirely different underlying causes and it is
therefore important to distinguish between them. Finally, we may be
able to infer some phenotypes from others, thereby limiting the number
of phenotypes that must be experimentally observed. For example, when
the fluid flow rate in single heart beating processes is increased and
the }%DIFDELCMD < {\em %%%
\DIFdel{rate of heart beating}%DIFDELCMD < } %%%
\DIFdel{is increased within a process $P$,
then the rate of fluid flow through the heart will be increased for
$P$.
}\DIFdelend %DIF >  Based on these examples, we can make several observations about
%DIF >  process attributes. First, for a process like {\em Heart beating}, we
%DIF >  can distinguish between single occurrences and processes in which {\em
%DIF >    Heart beating} occurs multiple times. Only the latter kind of
%DIF >  process may have a {\em rate of heart beating}, while {\em Heart
%DIF >    beating} processes do not have such an attribute. Second, we can
%DIF >  distinguish between abnormal fluid flow rates in {\em Heart beating}
%DIF >  processes and rate of fluid flow through the heart within a given
%DIF >  duration. Both may have entirely different underlying causes and it is
%DIF >  therefore important to distinguish between them. Finally, we may be
%DIF >  able to infer some phenotypes from others, thereby limiting the number
%DIF >  of phenotypes that must be experimentally observed. For example, when
%DIF >  the fluid flow rate in single heart beating processes is increased and
%DIF >  the {\em rate of heart beating} is increased within a process $P$,
%DIF >  then the rate of fluid flow through the heart will be increased for
%DIF >  $P$.

Here, we present the foundations for an ontology of process
phenotypes. We present \DIFdelbegin \DIFdel{the outline of }\DIFdelend a theory in which several kinds of process
attributes can be distinguished so that normal and abnormal physiology
of biological systems can be formally characterized. We apply this
theory to cellular processes and create the Cellular Phenotype Ontology
(CPO). The CPO is linked to reference ontologies for qualities,
biological processes, functions and cell components. Its prime
applications are the unification of cellular phenotypes across
different species and the annotation of cellular phenotypes in
domains in which no such ontology exists.


\begin{methods}
\section{System and methods}
% \subsection{Formal ontology}
% The ontology as an approach to semantic standardisation was proposed
% more than a decade ago and since then has become the dominant
% methodology used to semantically categorise phenodeviance.  The
% biomedical research community has invested considerable effort and
% resources in the development and establishment of ontologies that are
% becoming increasingly successful as information management and
% integration tools in many disparate scientific fields allowing
% interoperability and semantic information processing between diverse
% biomedical resources and domains.

% In computer science, an ontology is a specification of a
% conceptualization of a domain of knowledge \cite{Gruber1995,
%   Guarino1998}.  Ontologies commonly distinguish between {\em classes}
% (also called {\em concepts}, {\em categories} or {\em universals}) and
% {\em individuals} within a domain of knowledge. A class is an entity
% that can have {\em instances}, while individuals are entities that
% cannot be instantiated \cite{Herre2006}. Examples of individuals
% include the Eiffel tower or the 2009 Ironman Triathlon in Hawaii,
% while examples of classes include {\em Tower} or {\em Triathlon}. The
% Eiffel tower can be an instance of the class {\em Tower}, and the 2009
% Ironman Triathlon an instance of {\em Triathlon}.  The meaning of
% classes is specified by stating what must be true of their instances.

% In addition to classes and individuals, ontologies often include {\em
%   relations}. Relations hold between entities, they are the ``the glue
% that holds things together, the primary constituents of the facts that
% go to make up reality'' \cite{Barwise1989}.

% In {\em formal} ontologies, the specification of classes and relations
% follows the axiomatic-deductive method. Given a set of terms that are
% used within a domain and whose meaning we wish to specify, we begin by
% providing {\em explicit definitions} for some terms, potentially
% introducing new terms. An explicit definition of a term $t$ is a
% statement that can replace every occurrence of $t$ in any sentence.

% Eventually, a set of {\em primitive terms} remains that are not
% further defined. Following the axiomatic method \cite{Hilbert1918},
% using only the primitive terms, we can construct complex
% sentences. Based on the intended meaning of the primitive terms, we
% consider some of these sentences true and some of them false in our
% domain. We select some of the true sentences as {\em axioms} which
% provide the core of our ontology. Ideally, the axioms are chosen so
% that all true sentences in the domain we intend to represent follow by
% means of logical deduction from the axioms. More commonly, however,
% only {\em some aspects} of the intended meaning are formally
% represented while other aspects are omitted either due to limitations
% in language expressivity or due to their irrelevance to the problem
% for which an ontology is developed.

% Based on the axioms and definitions, we can use deduction to infer
% statements that logically follow from the axioms.  The process of
% automatically deducing sentences from axioms is called {\em automated
%   reasoning}. Automated reasoning allows users of an ontology to carry
% out key activities: verifying the ontology's consistency, inferring
% hidden knowledge and thereby performing powerful queries.  An ontology
% is formally inconsistent if there is a statement $\phi$ such that
% $\phi$ and its negation $\neg \phi$ can be inferred from the
% ontology's axioms.  If an ontology is formally inconsistent, {\em
%   every} statement can be inferred from the ontology.  

% Automated reasoning can further determine whether classes in an
% ontology are unsatisfiable: a class $C$ is unsatisfiable, if it is
% impossible for the class to have any instances. Unsatisfiable classes
% in an ontology are commonly the result of a contradictory class
% definition. % That a class is unsatisfiable is often a special kind of
% % unintended consequence that can be drawn from an ontology.

% Automated reasoning in the Web Ontology Language (OWL) can be employed
% to automatically compute the generalization hierarchy underlying an
% ontology as well as for verification of data consistency and complex
% queries \cite{Hoehndorf2011incon, Hoehndorf2011models}. Highly
% efficient automated reasoners are available to process OWL ontologies
% \cite{Sirin2004, Tsarkov2006, Motik2009a}. OWL profiles were developed
% to support even large ontologies by further reducing the expressivity
% of OWL in order to enable polynomial-time inferences. In particular
% the OWL EL profile was found to provide the expressivity required for
% most biomedical ontologies \cite{el4, elvira}, and highly optimized
% OWL EL reasoners are available or under development to support
% reasoning over very large ontologies \cite{el4, cbreasoner}.

% A high expressivity is required to accurately specify complex axioms
% that constrain the domain under investigation, and languages with
% higher expressivity than OWL are often required in the biomedical
% domain to achieve this goal \cite{Hoehndorf2009sequences, rnao}. On
% the other hand, automated reasoning over large ontologies and
% associated datasets benefits from languages with a low complexity of
% inferences in which complex axioms cannot be formulated. Therefore, a
% possible solution is to use a layered approach: to specify the meaning
% of terms using an expressive language, and derive the axioms that
% must obtain in a weaker language using deductive inferences.

% \subsection{Processes and their participants}
% Most biomedical ontologies share common distinctions between different
% kinds of entities: physical objects, qualities, functions and
% processes. A physical object is an entity that is present as a whole
% at a time point, i.e., an entity that has no temporal parts. A quality
% is an attribute or feature of an entity. Physical objects together
% with their qualities give rise to functions, which are capabilities or
% potentials of physical objects. These functions can then be realized
% by processes, which are temporally extended entities
% \cite{Burek2006}. Examples of classes that may have processes as
% instances include {\em Drinking}, {\em Triathlon} or {\em Apoptosis}.
% Figure \ref{fig:onto} illustrates these basic categories of being.

% Processes commonly have physical objects as participants. Physical
% objects are entities that are present as a whole at time points (i.e.,
% they have no temporal parts) and may persist through time, i.e., they
% may undergo changes during the process \cite{Herre2006, Herre2010}.
% We can further distinguish different {\em roles} of participation in a
% process \cite{Loebe2007}. For example, a runner may participate in a
% {\em Triathlon} process as a {\em Runner (role)}, while another person
% can participate in the same process but in a different role (such as
% {\em Referee}).

% % Amongst the participants of some processes, we will distinguish
% % between {\em inputs} and {\em outputs} of processes. Chemical
% % reactions, for example, are processes with definite inputs (the {\em
% %   reactants}) and outputs ({\em the products}).

\subsection{Gene Ontology}
The Gene Ontology provides a set of ontologies for molecular and
cellular biology, originally designed to support structured
annotations for genes and gene products in all species with respect to
molecular function (MF), biological process (BP) and cellular
component (CC). MF and BP both describe processes, but at different
spatiotemporal scales; in particular, BP includes processes that
unfold within cells and within tissues and organs of multicellular
organisms. Gene product annotations can be interpreted as identifying
participants in the processes.

Over time, GO development has increasingly emphasized a normalized
approach that includes supplementing existing human-readable text
descriptions with formally specified explicit definitions for GO
classes. The formalization of GO is readily apparent in its
representation of biological regulation.

Regulatory processes may regulate other processes, at either the MF or
BP scale, or biological qualities. GO accordingly includes three broad
categories of regulation terms: regulation of molecular function,
regulation of biological process, and regulation of biological
quality. The first two are explicitly defined entirely with respect to
other GO terms, whereas the third comprises classes in which the
regulated qualities are specified by terms from PATO (see below) or
anatomy ontologies.

All GO regulation terms use one of three relations, {\bf regulates},
{\bf negatively\_regulates} and {\bf positively\_regulates}, to link
regulation terms to process or quality terms. The {\bf regulates}
relations are defined in terms of qualities: a regulatory process
causes a change in magnitude to some quality, which in turn has an
effect on the frequency, rate or duration of some other type of
process. Effects that result in increases and decreases use {\bf
  positively\_regulates} and {\bf negatively\_regulates}, respectively
\cite{Mungall2010go}. The existing ontology structure would also
support the addition of subclasses to distinguish, for example,
regulation of the rate of a process from regulation of its duration or
time of onset.

\subsection{PATO and the EQ model}
PATO was designed to provide a platform for the
integration of quantitative and qualitative phenotype-related
information across different domains, levels of granularity and
species \cite{Gkoutos2005}.  PATO is an ontology of phenotypic
qualities, the basic entities that we can perceive or measure,
such as colors, sizes and rates. One of its classification axes is
based on the type of entity to which a quality belongs:
PATO distinguishes between qualities of physical objects and qualities
of processes.

PATO allows for the description of phenotypes by combining
terms from ontologies that describe the affected entities,
such as anatomy ontologies, GO,
the Cell Type Ontology \cite{Bard2005} and the Sequence Ontology (SO)
\cite{Eilbeck2005}, with the qualities it provides for defining how
these entities are affected.  PATO can be used either directly for
annotation, in a so-called post-composed (post-coordinated) manner, or
to provide formal (logical) definitions (equivalence axioms) for
ontologies containing a set of pre-composed (pre-coordinated)
phenotype terms. For
instance, to describe the decrease in the length of the sexual cycle
of female animals, we can combine the PATO term \textit{\DIFdelbegin \DIFdel{decreased
  }\DIFdelend \DIFaddbegin \DIFadd{Decreased
  }\DIFaddend duration} ({\tt PATO:0000499}) with the Gene Ontology term
\textit{\DIFdelbegin \DIFdel{estrous }\DIFdelend \DIFaddbegin \DIFadd{Estrous }\DIFaddend cycle} ({\tt GO:0042698}), whilst if such a term
existed in a pre-composed ontology (for example, the {\tt MP:0009007}
term from the Mammalian Phenotype Ontology), it could be used to provide an
equivalence statement between that class and the above PATO-based
description.
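
As a sketch of what such an equivalence statement could look like in
description logic notation, assume a relation {\bf inheres-in} that
links a quality to the entity bearing it. The relation name is an
assumption made for illustration and is not prescribed by the text
above.
\begin{eqnarray*}
\textit{MP:0009007} & \equiv & \textit{Decreased duration} \; \sqcap \\
 & & \exists\,\textbf{inheres-in}.\textit{Estrous cycle}
\end{eqnarray*}
Under this reading, the pre-composed class and the post-composed
description denote the same set of phenotypes, so annotations made in
either style can be compared directly.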
\DIFaddbegin 

\subsection{\DIFadd{Axioms for physiology phenotypes}}
\DIFadd{We implement our theory of physiology phenotypes using OWL, a formal
language based on description logics. Using OWL, we formulate axioms
that can be used by automated reasoners to infer additional
information. Automated reasoning and the axioms we provide are
intended to satisfy three aims. First, we use the axioms to infer
information from background knowledge. In particular, we aim to
automatically generate a class hierarchy of physiology phenotypes when
an ontology of physiological processes, such as the GO, is provided as
background knowledge. Our second aim is to provide interoperability
with phenotype ontologies of other domains, including species-specific
phenotype ontologies that have been formalized using the EQ method
}\cite{Mungall2010, Gkoutos2005}\DIFadd{. Finally, our third aim is to query
physiology phenotypes based on physiological processes that are
affected or based on the way in which they are affected. 
}

\DIFadd{Our three aims rely on the possibility of using automated reasoning
over a resulting ontology of physiology phenotypes.  However, OWL is
an expressive formal language, and automated reasoning in OWL has a
high computational complexity (reasoning in OWL 2 DL is
2NEXPTIME-complete). Consequently, due to the exponential increase in
worst-case time complexity for automated reasoning in OWL, we would
not be guaranteed to achieve our aims once we consider more than a
small number of phenotype classes. In particular, using an ontology of the size of
GO, with more than 22,000 classes for processes, as a foundation for
constructing an ontology of physiology phenotypes would not allow us
to achieve our aims through automated reasoning.
}

\DIFadd{The OWL EL profile is a subset of OWL that significantly decreases the
expressivity of OWL and the resulting time complexity of automated
reasoning }\cite{owlprofiles}\DIFadd{. Highly efficient automated reasoners
have been developed for OWL EL which are capable of processing very
large ontologies }\cite{Kazakov2011}\DIFadd{. In order to achieve our aims and
utilize automated reasoning for large ontologies, we limit ourselves
to the OWL EL profile. As a consequence of the restriction to OWL EL,
we cannot make use of negation (}{\tt \DIFadd{not}}\DIFadd{), disjunction (}{\tt \DIFadd{or}}\DIFadd{),
universal quantification (}{\tt \DIFadd{forall}}\DIFadd{) and several other types of
operations in our axioms }\cite{owlprofiles}\DIFadd{.
}
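
As a schematic illustration of this restriction, let $C$ and $D$ be
placeholder class names and $R$ a placeholder relation (none of them
taken from the CPO). Class expressions built from conjunction and
existential quantification remain within OWL EL, whereas the three
constructs named above do not:
\begin{eqnarray*}
\mbox{within OWL EL:} & & C \sqcap D, \quad \exists R.C \\
\mbox{outside OWL EL:} & & \neg C, \quad C \sqcup D, \quad \forall R.C
\end{eqnarray*}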

\DIFadd{The lack of expressivity in OWL EL requires a formulation of axioms so
that the inferences we desire (i.e., the subclass relations resulting
in the ontology's taxonomy) are maintained without using features of
OWL that go beyond OWL EL expressivity. Consequently, we formulate
}{\em \DIFadd{abnormality}} \DIFadd{and }{\em \DIFadd{absence}} \DIFadd{similarly to current
formalizations of EQ-based phenotype ontologies, without the use of
negation, disjunction or universal quantification
}\cite{Mungall2010}\DIFadd{. A detailed description of the axioms we implement
is available as supplementary material (Supplement 1).
}\DIFaddend \end{methods}

\section{Results}
\subsection{Attributes of processes}
We develop a model of process attributes that is applicable to
representations of physiology and related phenotypes. In principle, we
distinguish between three different kinds of process attributes: the
first are process attributes that arise directly from processes and
include {\em \DIFdelbegin \DIFdel{duration}\DIFdelend \DIFaddbegin \DIFadd{Duration}\DIFaddend } and {\em \DIFdelbegin \DIFdel{temporal }\DIFdelend \DIFaddbegin \DIFadd{Temporal }\DIFaddend location}; the second are
attributes that arise from processes and their temporal parts and
include {\em \DIFdelbegin \DIFdel{frequency}\DIFdelend \DIFaddbegin \DIFadd{Frequency}\DIFaddend } and {\em \DIFdelbegin \DIFdel{onset}\DIFdelend \DIFaddbegin \DIFadd{Onset}\DIFaddend }; and the third are attributes
that arise from processes and qualities of their participants, and
include {\em \DIFdelbegin \DIFdel{flow rates}\DIFdelend \DIFaddbegin \DIFadd{Flow rate}\DIFaddend }\DIFaddbegin \DIFadd{s}\DIFaddend .

Attributes that can be directly linked to a process arise from
processes' temporal extension. For example, a duration is an attribute
that characterizes the temporal extent of a process and is similar to
{\em Length}, {\em Area} and {\em Volume} for one-, two- and
three-dimensional physical objects. A {\em Temporal location}
positions the time interval at which the process occurs with respect
to a reference coordinate system.

However, the majority of attributes that characterize processes are
not based on these types of process attributes alone, but rather
relate attributes of process participants to the duration of a
process. In particular, a {\em \DIFdelbegin \DIFdel{rate}\DIFdelend \DIFaddbegin \DIFadd{Rate}\DIFaddend } typically refers to an attribute
of some entity {\em with respect to an attribute of another entity},
and in the context of processes, rates often refer to attributes of a
process participant with respect to the duration of the process. For
example, a {\em \DIFdelbegin \DIFdel{mass }\DIFdelend \DIFaddbegin \DIFadd{Mass }\DIFaddend flow rate} refers to the {\em Mass} of a process
participant with respect to the \DIFdelbegin \DIFdel{duration }\DIFdelend \DIFaddbegin {\em \DIFadd{Duration}} \DIFaddend of the process, i.e.,
how much matter is moved (from one point to another) through the
process.  As a more complex example, a {\em \DIFdelbegin \DIFdel{rate }\DIFdelend \DIFaddbegin \DIFadd{Rate }\DIFaddend of change of
  position} refers to the {\em \DIFdelbegin \DIFdel{distance}\DIFdelend \DIFaddbegin \DIFadd{Distance}\DIFaddend } that an object is moved with
respect to the \DIFdelbegin \DIFdel{duration }\DIFdelend \DIFaddbegin {\em \DIFadd{Duration}} \DIFaddend of the process.

However, not all rates of a process depend on attributes of the
process participants. In particular, a {\em \DIFdelbegin \DIFdel{frequency }\DIFdelend \DIFaddbegin \DIFadd{Frequency }\DIFaddend of occurrence}
or {\em \DIFdelbegin \DIFdel{event }\DIFdelend \DIFaddbegin \DIFadd{Event }\DIFaddend rate} refers to the number of occurrences of a type of
process during a reference process. For example, a {\em \DIFdelbegin \DIFdel{rate }\DIFdelend \DIFaddbegin \DIFadd{Rate }\DIFaddend of heart
  beating} refers to the number of {\em Heart beating} processes that
occur within a reference process (e.g., a process in which the heart
participates with a duration of one minute). Further attributes that
depend on types of processes with regard to a reference process are
{\em distribution patterns}, i.e., how the occurrences of processes of
a particular type are distributed within a reference process. For
example, the heart may beat \DIFdelbegin %DIFDELCMD < {\em %%%
\DIFdel{rhythmically }%DIFDELCMD < } %%%
\DIFdel{or }%DIFDELCMD < {\em %%%
\DIFdel{arrhythmically }%DIFDELCMD < }
%DIFDELCMD < %%%
\DIFdelend \DIFaddbegin \DIFadd{rhythmically or arrhythmically }\DIFaddend within a
period of time (see Figure \ref{fig:patterns}).
\begin{figure}
  \centering
  \includegraphics[width=.5\textwidth]{processpatterns.eps}
  \caption{\label{fig:patterns}Six examples of processes with
    non-comparative and comparative process attributes.  We assume
    that the processes labelled $a$, $b$, $c$ and $d$ are all
    instances of the class of processes $P$.  On the left side, three
    regulation (of $P$) processes are illustrated which exhibit
    non-comparative attributes. The first process has an attribute of
    {\em \DIFdelbeginFL \DIFdelFL{rhythmic}%DIFDELCMD < } %%%
\DIFdelendFL \DIFaddbeginFL \DIFaddFL{Rhythmic }\DIFaddendFL occurrence of \DIFdelbeginFL \DIFdelFL{$P$ }\DIFdelendFL \DIFaddbeginFL \DIFaddFL{P}} \DIFaddendFL because the instances of $P$ are
    temporally equidistantly distributed. The second example shows an
    {\em \DIFdelbeginFL \DIFdelFL{arrhythmic}%DIFDELCMD < } %%%
\DIFdelendFL \DIFaddbeginFL \DIFaddFL{Arrhythmic }\DIFaddendFL occurrence of \DIFdelbeginFL \DIFdelFL{$P$}\DIFdelendFL \DIFaddbeginFL \DIFaddFL{P}}\DIFaddendFL , and the third example shows an
    {\em \DIFdelbeginFL \DIFdelFL{increasing }\DIFdelendFL \DIFaddbeginFL \DIFaddFL{Increasing }\DIFaddendFL frequency \DIFdelbeginFL %DIFDELCMD < } %%%
\DIFdelFL{(}\DIFdelendFL of \DIFdelbeginFL \DIFdelFL{$P$)}\DIFdelendFL \DIFaddbeginFL \DIFaddFL{P}}\DIFaddendFL . A regulation process with an \DIFdelbeginFL \DIFdelFL{increasing }\DIFdelendFL \DIFaddbeginFL {\em
      \DIFaddFL{Increasing }\DIFaddendFL frequency \DIFdelbeginFL \DIFdelFL{(}\DIFdelendFL of \DIFdelbeginFL \DIFdelFL{$P$) }\DIFdelendFL \DIFaddbeginFL \DIFaddFL{P}} \DIFaddFL{attribute }\DIFaddendFL is a process in which the
    \DIFdelbeginFL \DIFdelFL{frequency }\DIFdelendFL \DIFaddbeginFL \DIFaddFL{value of the }{\em \DIFaddFL{Frequency }\DIFaddendFL of occurrences of \DIFdelbeginFL \DIFdelFL{$P$ }\DIFdelendFL \DIFaddbeginFL \DIFaddFL{P}} \DIFaddFL{attribute }\DIFaddendFL is
    lower in the first half of the process than in the second
    half. The right side of the figure illustrates comparative
    phenotypic descriptions of processes. On the upper right, the {\em
      normal} reference \DIFaddbeginFL \DIFaddFL{process }\DIFaddendFL is shown. The second example
    illustrates a {\em \DIFdelbeginFL \DIFdelFL{late }\DIFdelendFL \DIFaddbeginFL \DIFaddFL{Late }\DIFaddendFL onset \DIFdelbeginFL %DIFDELCMD < } %%%
\DIFdelendFL of \DIFdelbeginFL \DIFdelFL{$P$}\DIFdelendFL \DIFaddbeginFL \DIFaddFL{P}}\DIFaddendFL , i.e., the attribute that $P$
    processes begin later than {\em normal} \DIFaddbeginFL \DIFaddFL{processes}\DIFaddendFL . Finally, the
    lower right illustrates a {\em \DIFdelbeginFL \DIFdelFL{decreased }\DIFdelendFL \DIFaddbeginFL \DIFaddFL{Decreased }\DIFaddendFL frequency \DIFdelbeginFL %DIFDELCMD < } %%%
\DIFdelFL{(}\DIFdelendFL of \DIFdelbeginFL \DIFdelFL{$P$)}\DIFdelendFL \DIFaddbeginFL \DIFaddFL{P}}\DIFaddendFL , since
    fewer processes of the type $P$ occur within the reference process
    than {\em normal}.}
\end{figure}

Related to distribution patterns are {\em changing qualities} of
processes. For example, the rate of heart beating may change ({\em
  increase} or {\em decrease}) throughout the course of a reference
process. A simple analysis of {\em increasing} ({\em decreasing})
rates would be that the rate of heart beating within the first half
of a process is {\em lower} ({\em higher}) than in the second half of
the process. To make such an assertion, we divide a process into two
temporal parts. Mathematically, this process of sub-division can be
iterated until processes occur within infinitesimally small time
intervals.
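
As a minimal sketch of this analysis (the symbols $N_P$, $\mathrm{dur}$,
$f_P$, $q_1$ and $q_2$ are notation introduced here for illustration
only), let $N_P(q)$ denote the number of occurrences of processes of
type $P$ within a reference process $q$, and $\mathrm{dur}(q)$ the
duration of $q$. The frequency of occurrence of $P$ in $q$ could then be
written as
\[
  f_P(q) = \frac{N_P(q)}{\mathrm{dur}(q)} ,
\]
and, if $q$ is divided into a first half $q_1$ and a second half $q_2$
of equal duration, $q$ would exhibit an increasing frequency of $P$ when
\[
  f_P(q_1) < f_P(q_2) ,
\]
and a decreasing frequency when the inequality is reversed. For
instance, a reference process of two minutes with 60 heart beats in the
first minute and 90 in the second yields $f_P(q_1) = 60$ and
$f_P(q_2) = 90$ beats per minute, i.e., an increasing rate of heart
beating.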

Some processes can be subdivided indefinitely while retaining
certain kinds of attributes, whereas others cannot.
% A class $C$ of {\em continuous
%   processes} is a class that has processes $p$ as instance such that
% all temporal parts of $p$ are also instances of $C$. 
Examples of processes that can be divided include {\em \DIFdelbegin \DIFdel{continuous
  movements}\DIFdelend \DIFaddbegin \DIFadd{Continuous
  movement}\DIFaddend } or {\em \DIFdelbegin \DIFdel{mass }\DIFdelend \DIFaddbegin \DIFadd{Mass }\DIFaddend flow} processes, for which all parts have a
{\em \DIFdelbegin \DIFdel{speed}\DIFdelend \DIFaddbegin \DIFadd{Speed}\DIFaddend } or {\em \DIFdelbegin \DIFdel{flow }\DIFdelend \DIFaddbegin \DIFadd{Flow }\DIFaddend rate} attribute. On the other hand, some
processes can be subdivided into stages of activity and stages of
inactivity \DIFaddbegin \DIFadd{(with respect to a particular process type) }\DIFaddend and cannot
arbitrarily be divided. For example, a process of {\em \DIFdelbegin \DIFdel{heart }\DIFdelend \DIFaddbegin \DIFadd{Heart }\DIFaddend beating}
has periods of activity \DIFdelbegin \DIFdel{(a single heart beat )
and inactivity}\DIFdelend \DIFaddbegin \DIFadd{in which a heart beat occurs and periods in
which no heart beat process occurs}\DIFaddend . Consequently, not all parts of the
process have a {\em \DIFdelbegin \DIFdel{heart }\DIFdelend \DIFaddbegin \DIFadd{Heart }\DIFaddend rate} \DIFaddbegin \DIFadd{(}{\em \DIFadd{Rate of heart beating}}\DIFadd{)
}\DIFaddend attribute.

We may further attribute a {\em \DIFdelbegin \DIFdel{frequency}\DIFdelend \DIFaddbegin \DIFadd{Frequency}\DIFaddend } or {\em \DIFdelbegin \DIFdel{rate}\DIFdelend \DIFaddbegin \DIFadd{Rate}\DIFaddend } to an object
instead of a process. For example, the frequency of a heart that beats
{\em now} at 80 beats per minute, or the speed of a car that is moving
at 180 kilometres per hour {\em at a particular point in time} (e.g.,
as observed with a speed camera), can be considered attributes of the
objects (the heart\DIFaddbegin \DIFadd{, }\DIFaddend or the car), not attributes of the processes in
which the objects participate. However, these are {\em different}
kinds of attributes. Rates, when considered as attributes of objects,
may be explicitly defined using rates of processes. For example, the
heart beating frequency of a particular heart $h$ at a time point $t$
is the frequency of a reference heart beating process in which $h$
participates. Such a reference process is necessary in order to obtain
a value for a frequency even when no {\em \DIFdelbegin \DIFdel{heart }\DIFdelend \DIFaddbegin \DIFadd{Heart }\DIFaddend beating} process is
occurring. However, the frequency is only an attribute of the heart in
virtue of such a reference process in which {\em \DIFdelbegin \DIFdel{heart }\DIFdelend \DIFaddbegin \DIFadd{Heart }\DIFaddend beating} is
actually occurring.  This reference process can be uniquely determined
for processes such as {\em \DIFdelbegin \DIFdel{continuous }\DIFdelend \DIFaddbegin \DIFadd{Continuous }\DIFaddend movement}, where the rate of an
object at a time $t$ is the rate of the infinitesimally small process
that occurs around $t$. The reference process is ambiguous for
processes such as {\em \DIFdelbegin \DIFdel{heart }\DIFdelend \DIFaddbegin \DIFadd{Heart }\DIFaddend beating}, and the reference process must
be explicitly stated.
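
One possible way to state this distinction is sketched below, under the
assumption that the position $x$ of an object $o$ in a continuous
movement is a differentiable function of time (the symbols $r_o$, $x$,
$f_h$, $N_h$ and $\Delta$ are illustrative notation). The rate of $o$ at
$t$ is then
\[
  r_o(t) = \lim_{\Delta \to 0} \frac{x(t + \Delta) - x(t - \Delta)}{2\Delta} ,
\]
so that the reference process around $t$ is fixed by the movement
itself. For heart beating, the corresponding quotient is
\[
  f_h(t, \Delta) = \frac{N_h(t - \Delta, t + \Delta)}{2\Delta} ,
\]
where $N_h(t - \Delta, t + \Delta)$ counts the heart beats of $h$ within
the interval. This quotient does not converge to a meaningful value as
$\Delta \to 0$, because the count eventually drops to zero or one; its
value depends on the chosen $\Delta$, which is why the reference process
has to be stated explicitly.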

% Figure \ref{fig:patterns} illustrates some examples of
% non-comparative and comparative process attributes.

% A similar construction as for continuous processes can be made for
% discrete processes. A rate of blood flow is immediately an attribute
% of a single heart beat process. However, during a period of time, the
% cumulative rate of blood flow can be observed as the sum of the blood
% flow rates of individual heart beating processes.


% Complex relations between Based on these definition patterns,
% interdependencies between process attributes can be observed.

\subsection{Cellular Phenotype Ontology}
While our considerations about process attributes are only the
beginnings of a full-fledged theory, we have derived several phenotype
formalization patterns and a high-level taxonomic structure of
process-based phenotypes. To evaluate our approach, we created the
Cellular Phenotype Ontology (CPO) by automatically applying our
patterns to the GO.

% CPO contains both structural abnormalities and physiological
% abnormalities of cells.
Phenotypes in the CPO are either based on structural abnormalities or
abnormal physiology involving cells or cell components. Structural
abnormalities in the CPO are based on GO's Cellular Component (GO-CC)
hierarchy. GO-CC contains 2,918 classes for cell parts (including {\em
  Cell}) and extracellular components of cells. For each cellular
component class $C$ in the GO-CC, we create a new class labelled {\em
  $C$ phenotype} in the CPO. For example, for the class {\em
  Mitochondrion} ({\tt GO:0005739}) in the GO-CC, we create the class
{\em Mitochondrion phenotype}.

Amongst the structural phenotype classes, we first distinguish between
{\em normal} and {\em abnormal} phenotypes. An {\em Abnormal phenotype
  of $C$} is a phenotype of an organism that does not have a normal
$C$ as part, while a {\em Normal phenotype of $C$} represents the
state in which an organism has a {\em normal $C$} as part.

Amongst the abnormal phenotypes that we include for all cell
components listed in GO-CC, we distinguish {\em \DIFdelbegin \DIFdel{abnormal }\DIFdelend \DIFaddbegin \DIFadd{Abnormal }\DIFaddend morphology}
and {\em \DIFdelbegin \DIFdel{abnormal }\DIFdelend \DIFaddbegin \DIFadd{Abnormal }\DIFaddend physiology} phenotypes. An {\em Abnormal morphology
  of $C$} is either the (abnormal) absence of required parts of $C$,
the (abnormal) presence of additional parts, or abnormal qualities of
$C$ \cite{Hoehndorf2010phene}. For example, an \DIFdelbegin \DIFdel{absence of
  caveolae}\DIFdelend \DIFaddbegin {\em \DIFadd{Absence of
  caveolae}} \DIFaddend ({\tt MP:0004150}) would be a subclass of {\em Abnormal
  morphology of plasma membrane} in virtue of caveolae necessarily
being part of the {\em Plasma membrane} ({\tt GO:0005886}).

Abnormal physiology of a cell component refers to abnormal {\em
  functionality} of a cell component. We assume that a functionality
of a cell component is (the potential for) a process in which the cell
component is (causally) involved. We use the definitions of GO classes
that were created based on lexical decompositions of GO class labels
\cite{Mungall2010go, Bada2007a, Ogren2004} to identify the processes
in which cell components are involved. For example,
the GO class {\em Mitochondrial fission} ({\tt GO:0000266}) is
explicitly defined as an {\em Organelle fission} ({\tt GO:0048285})
that {\bf results-in-the-division-of} a {\em Mitochondrion} ({\tt
  GO:0005739}). Based on this definition, we make the assumption that
{\em Mitochondrial fission} is one of the functions of a {\em
  Mitochondrion} and that an {\em Abnormality of mitochondrial
  fission} is a subclass of an {\em Abnormality of mitochondrion
  physiology}.

Amongst abnormal physiology, we distinguish between abnormalities in a
{\em single occurrence} of a cell component's functioning and an
abnormal {\em pattern of multiple occurrences} of a cell component's
functioning. For example, abnormalities in cell division resulting in
{\em Aneuploidy} refer to abnormalities of {\em \DIFdelbegin \DIFdel{cell }\DIFdelend \DIFaddbegin \DIFadd{Cell }\DIFaddend division}
processes, while an {\em \DIFdelbegin \DIFdel{increased }\DIFdelend \DIFaddbegin \DIFadd{Increased }\DIFaddend rate of cell division} refers to an
abnormality in the pattern of occurrence of multiple cell division
processes. In the CPO we follow the GO and represent abnormalities in
the pattern of occurrence of $X$ as abnormalities of {\em \DIFdelbegin \DIFdel{regulation
  }\DIFdelend \DIFaddbegin \DIFadd{Regulation
  }\DIFaddend of $X$} processes.  In particular, an {\em \DIFdelbegin \DIFdel{increased }\DIFdelend \DIFaddbegin \DIFadd{Increased }\DIFaddend rate of cell
  division} is not an attribute of {\em \DIFdelbegin \DIFdel{cell }\DIFdelend \DIFaddbegin \DIFadd{Cell }\DIFaddend division} processes, but
rather \DIFdelbegin \DIFdel{an }\DIFdelend \DIFaddbegin \DIFadd{arises from the collection of all }{\em \DIFadd{Cell division}} \DIFadd{processes
that occur within an organism at a given time. In the GO, }{\em
  \DIFadd{Regulation of $X$}} \DIFadd{processes refer to those processes that determine
how often and in which way one or more $X$ processes occur. Therefore,
we assign the }\DIFaddend attribute of \DIFdelbegin \DIFdel{the }\DIFdelend {\em \DIFdelbegin \DIFdel{regulation }\DIFdelend \DIFaddbegin \DIFadd{Increased rate of cell division}} \DIFadd{to
}{\em \DIFadd{Regulation }\DIFaddend of cell division} \DIFaddbegin \DIFadd{processes}\DIFaddend .

Single occurrences of processes can be abnormal in multiple ways,
depending on the type of process.
%
First, common to all processes is the quality of {\em \DIFdelbegin \DIFdel{duration}\DIFdelend \DIFaddbegin \DIFadd{Duration}\DIFaddend } and
consequently, each process can have an {\em abnormal} (increased or
decreased) duration. For example, a part of an organism may
participate in an {\em Inflammatory response} ({\tt GO:0006954}) that
lasts longer than normal, i.e., the organism has an {\em Increased
  duration of inflammatory response} phenotype. We define such a
phenotype as a phenotype of an organism which has a part that
participates in {\em Inflammatory response}, and this {\em
  Inflammatory response} process has an {\em Increased duration} ({\tt
  PATO:0000498}).
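
In description logic notation, this definition pattern could be
sketched as follows. The relation names {\bf has-part}, {\bf
  participates-in} and {\bf has-quality} are assumed here for
illustration and may differ from the relations used in the generated
OWL file. Organisms with an {\em Increased duration of inflammatory
  response} phenotype would be those that satisfy
\[
  \begin{array}{l}
    \exists \mbox{\bf has-part} . \exists \mbox{\bf participates-in} . \\
    \quad (\mbox{\em Inflammatory response} \sqcap
      \exists \mbox{\bf has-quality} . \mbox{\em Increased duration}) .
  \end{array}
\]
The nesting mirrors the textual definition: the organism has a part,
the part participates in a process of the given type, and that process
bears the quality.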

The second common type of abnormality comprises abnormalities based on
process participants in relation to the duration of the process. These
include all kinds of {\em rates} such as {\em \DIFdelbegin \DIFdel{mass }\DIFdelend \DIFaddbegin \DIFadd{Mass }\DIFaddend flow rate}, {\em
  \DIFdelbegin \DIFdel{energy }\DIFdelend \DIFaddbegin \DIFadd{Energy }\DIFaddend flow rate} and {\em \DIFdelbegin \DIFdel{velocity}\DIFdelend \DIFaddbegin \DIFadd{Velocity}\DIFaddend } (the rate of change of
position). In each of these cases, an object participates in a process
and a quality (or change of quality) of that object throughout the
duration of the process is considered to form a new quality. If the
process has participants that are distinguished into {\em inputs} and
{\em outputs}, then a recurring pattern is that the amount of inputs
or outputs that participate in the process can be {\em increased} or
{\em decreased}. For example, an {\em Increased rate of cytoplasmic
  streaming} can be defined as an increased amount of inputs or an
increased amount of outputs of a {\em \DIFdelbegin \DIFdel{cytoplasmic }\DIFdelend \DIFaddbegin \DIFadd{Cytoplasmic }\DIFaddend streaming} process.

Finally, some processes may be divided into stages during which
particular states of affairs obtain, and a process may be abnormal in
that these states of affairs do not obtain at a particular
stage. Notably, at the beginning and the end of a process, pre- and
post-conditions may obtain that are abnormally changed in a
process. For example, {\em Aneuploidy} -- an abnormality during cell
division in which the chromosomes do not separate properly between the
two cells -- may be considered the result of such an abnormality.

We implement the first two types of abnormality in the CPO. First, as
a subclass of each {\em Abnormality of P} class, we create {\em
  Abnormal duration of P}, which in turn has {\em Increased duration
  of P} and {\em Decreased duration of P} as subclasses. Second, if we
are able to identify {\em inputs} $I(P)$ or {\em outputs} $O(P)$ of
the process $P$ in the formal definitions of the GO, we automatically
generate {\em Abnormality of $I(P)$ in $P$} as well as {\em
  Abnormality of $O(P)$ in $P$}.  The left side of Figure
\ref{fig:overview} illustrates the schema of classes we generate for
single process abnormalities.
\begin{figure*}
  \centering
  \includegraphics[width=1\textwidth]{overview.eps}
%  \includegraphics[width=1\textwidth]{overview.pdf}
  \caption{Overview of the taxonomic structure of CPO. The structure
    is based on a cellular component class $X$ and the cellular
    processes $P(X)$ in which $X$ is involved.\label{fig:overview}}
\end{figure*}

The second type of abnormality in the CPO relates to abnormalities of
{\em multiple} occurrences of some process $X$. According to the GO,
{\em regulation} processes are processes that maintain or modify the
occurrence of processes of a particular type. Following this
convention, we call an abnormality of multiple occurrences of $X$ {\em
  \DIFdelbegin \DIFdel{abnormality }\DIFdelend \DIFaddbegin \DIFadd{Abnormality }\DIFaddend of the regulation of $X$}.

\DIFdelbegin %DIFDELCMD < \begin{table*}
%DIFDELCMD <   \centering
%DIFDELCMD <   \begin{tabular}{p{2.8cm}|p{4cm}|p{4cm}|p{4cm}}
%DIFDELCMD <     & %%%
\DIFdel{Increased flow rate }%DIFDELCMD < & %%%
\DIFdel{Normal flow rate }%DIFDELCMD < & %%%
\DIFdel{Decreased
    flow rate }%DIFDELCMD < \\
%DIFDELCMD <     \hline
%DIFDELCMD <     %%%
\DIFdel{Increased frequency }%DIFDELCMD < &%%%
\DIFdel{increased total flow rate }%DIFDELCMD < &%%%
\DIFdel{increased total
    flow rate }%DIFDELCMD < &%%%
\DIFdel{?}%DIFDELCMD < \\
%DIFDELCMD <     %%%
\DIFdel{Normal frequency }%DIFDELCMD < &%%%
\DIFdel{increased total flow rate }%DIFDELCMD < &%%%
\DIFdel{normal total flow
    rate }%DIFDELCMD < &%%%
\DIFdel{decreased total flow rate}%DIFDELCMD < \\
%DIFDELCMD <     %%%
\DIFdel{Decreased frequency }%DIFDELCMD < &%%%
\DIFdel{?}%DIFDELCMD < &%%%
\DIFdel{decreased total flow rate }%DIFDELCMD < &%%%
\DIFdel{decreased total
    flow rate}%DIFDELCMD < \\
%DIFDELCMD <     \hline
%DIFDELCMD <   \end{tabular}
%DIFDELCMD <   %%%
%DIFDELCMD < \caption{%
{%DIFAUXCMD
%DIFDELCMD < \label{tbl:flow}%%%
\DIFdel{Interdependency for the attribute }%DIFDELCMD < {\em
%DIFDELCMD <       %%%
\DIFdel{Total cytoplasmic flow rate}%DIFDELCMD < }%%%
\DIFdel{. A }%DIFDELCMD < {\em %%%
\DIFdel{Total cytoplasmic flow
      rate}%DIFDELCMD < } %%%
\DIFdel{is an attribute of }%DIFDELCMD < {\em %%%
\DIFdel{Regulation of cytoplasmic
      streaming}%DIFDELCMD < } %%%
\DIFdel{processes, while }%DIFDELCMD < {\em %%%
\DIFdel{Cytoplasmic flow rate}%DIFDELCMD < } %%%
\DIFdel{is an
    attribute of individual }%DIFDELCMD < {\em %%%
\DIFdel{cytoplasmic streaming}%DIFDELCMD < }
%DIFDELCMD <     %%%
\DIFdel{processes. Depending both on whether the cytoplasmic flow rate in
    individual }%DIFDELCMD < {\em %%%
\DIFdel{cytoplasmic streaming}%DIFDELCMD < } %%%
\DIFdel{processes is increased or
    decreased and whether the frequency of occurrence of }%DIFDELCMD < {\em
%DIFDELCMD <       %%%
\DIFdel{cytoplasmic streaming}%DIFDELCMD < } %%%
\DIFdel{is increased or decreased, the total
    cytoplasmic flow rate can be increased or decreased.}}
%DIFAUXCMD
%DIFDELCMD < \end{table*}
%DIFDELCMD < %%%
\DIFdelend %DIF >  \begin{table*}
%DIF >    \centering
%DIF >    \begin{tabular}{p{2.8cm}|p{4cm}|p{4cm}|p{4cm}}
%DIF >      & Increased flow rate & Normal flow rate & Decreased
%DIF >      flow rate \\
%DIF >      \hline
%DIF >      Increased frequency &increased total flow rate &increased total
%DIF >      flow rate &?\\
%DIF >      Normal frequency &increased total flow rate &normal total flow
%DIF >      rate &decreased total flow rate\\
%DIF >      Decreased frequency &?&decreased total flow rate &decreased total
%DIF >      flow rate\\
%DIF >      \hline
%DIF >    \end{tabular}
%DIF >    \caption{\label{tbl:flow}Interdependency for the attribute {\em
%DIF >        Total cytoplasmic flow rate}. A {\em Total cytoplasmic flow
%DIF >        rate} is an attribute of {\em Regulation of cytoplasmic
%DIF >        streaming} processes, while {\em Cytoplasmic flow rate} is an
%DIF >      attribute of individual {\em cytoplasmic streaming}
%DIF >      processes. Depending both on whether the cytoplasmic flow rate in
%DIF >      individual {\em cytoplasmic streaming} processes is increased or
%DIF >      decreased and whether the frequency of occurrence of {\em
%DIF >        cytoplasmic streaming} is increased or decreased, the total
%DIF >      cytoplasmic flow rate can be increased or decreased.}
%DIF >  \end{table*}

A first kind of abnormality of regulatory processes is an {\em abnormal
  temporal distribution pattern} of a process. In these
abnormalities, the {\em way} in which processes of a particular kind
are temporally distributed is abnormal.  The most common abnormal
distribution pattern is an increased or decreased frequency, and we
use PATO's {\em frequency} class to define {\em Abnormal frequency of
  occurrence of $X$}.
% \begin{verbatim}
% has-phenotype some (has-part-some (participates-in some 
%   (regulates some X and has-quality some (frequency and towards some X))))
% \end{verbatim}
For example, an {\em Abnormal frequency of occurrence of apoptosis} is
defined as an abnormality of {\em Regulation of apoptosis} ({\tt
  GO:0042981}) with respect to the {\em \DIFdelbegin \DIFdel{frequency}\DIFdelend \DIFaddbegin \DIFadd{Frequency}\DIFaddend } ({\tt
  PATO:0000044}) of {\em Apoptosis} ({\tt GO:0006915}) occurrences.
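
Written in description logic notation, and assuming illustrative
relation names ({\bf has-part}, {\bf participates-in}, {\bf
  has-quality} and {\bf towards}) that may differ from those in the
generated OWL file, the quality underlying this example could be
sketched as
\[
  \begin{array}{l}
    \exists \mbox{\bf has-part} . \exists \mbox{\bf participates-in} . \\
    \quad (\mbox{\em Regulation of apoptosis} \sqcap
      \exists \mbox{\bf has-quality} . \\
    \qquad (\mbox{\em Frequency} \sqcap
      \exists \mbox{\bf towards} . \mbox{\em Apoptosis})) ,
  \end{array}
\]
where the {\bf towards} relation links the frequency quality to the
type of process whose occurrences are counted. This sketch leaves open
how the qualifier {\em abnormal} is attached to the frequency quality.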

There are further types of deviation from a distribution pattern. For
example, a kind of process that is normally {\em rhythmic} can be
abnormal in that it is {\em arrhythmic}. A typical example of this
kind of process is {\em Heart beating} ({\tt GO:0060047}), in which
{\em Cardiac muscle contraction} ({\tt GO:0060048}) processes occur in
a rhythmic pattern. In {\em Cardiac dysrhythmia}, however, {\em
  Cardiac muscle contraction} processes occur arrhythmically, and we
consider this to be an abnormality of the regulation of {\em Cardiac
  muscle contraction}. While these abnormalities are often highly
informative in clinical diagnostics and biological investigations, we
usually lack the necessary information that is required to
automatically determine meaningful types of abnormal distribution
patterns.

A second kind of regulatory abnormality is related to the {\em
  onset} of a process. With respect to a reference process, a
particular kind of process may be {\em \DIFdelbegin \DIFdel{delayed}\DIFdelend \DIFaddbegin \DIFadd{Delayed}\DIFaddend } ({\tt PATO:0000502})
or {\em \DIFdelbegin \DIFdel{premature}\DIFdelend \DIFaddbegin \DIFadd{Premature}\DIFaddend } ({\tt PATO:0000694}). For example, {\em Delayed
  apoptosis} refers to an abnormality of the {\em Regulation of
  apoptosis} in which apoptosis is induced later than normal.  We use
the PATO quality {\em \DIFdelbegin \DIFdel{onset}\DIFdelend \DIFaddbegin \DIFadd{Onset}\DIFaddend } ({\tt PATO:0002325}) and its children
{\em \DIFdelbegin \DIFdel{delayed}\DIFdelend \DIFaddbegin \DIFadd{Delayed}\DIFaddend } and {\em \DIFdelbegin \DIFdel{premature}\DIFdelend \DIFaddbegin \DIFadd{Premature}\DIFaddend } to define these types of regulatory
abnormality. Similarly, we use PATO's {\em \DIFdelbegin \DIFdel{offset}\DIFdelend \DIFaddbegin \DIFadd{Offset}\DIFaddend } ({\tt
  PATO:0002324}) quality and its children to characterize regulatory
abnormalities in which a process ends prematurely or too late.

Finally, a third kind of regulatory abnormality refers to abnormal
rates with respect to a participant of the process that is being
regulated. For example, what is increased or decreased may not be the
cytoplasmic flow rate within a single {\em \DIFdelbegin \DIFdel{cytoplasmic }\DIFdelend \DIFaddbegin \DIFadd{Cytoplasmic }\DIFaddend streaming} process but
rather the total cytoplasmic flow rate, taken as a summation over all
cytoplasmic streaming processes that occur within an organism (or a
particular anatomical location). While a
flow rate of a single {\em \DIFdelbegin \DIFdel{cytoplasmic }\DIFdelend \DIFaddbegin \DIFadd{Cytoplasmic }\DIFaddend streaming} process is a quality
of that process, an increased {\em total} cytoplasmic flow rate is a
quality of the regulation of {\em \DIFdelbegin \DIFdel{cytoplasmic }\DIFdelend \DIFaddbegin \DIFadd{Cytoplasmic }\DIFaddend streaming}. In
particular, it is possible for an organism to have a normal --- or
even a decreased --- cytoplasmic flow rate in each individual
cytoplasmic streaming process while at the same time having an
increased total cytoplasmic flow rate due to a large increase in the
frequency of occurrence of cytoplasmic streaming processes. Similarly,
the frequency of occurrence of cytoplasmic streaming may be normal or
decreased while the total cytoplasmic flow rate is increased due to an
increased cytoplasmic flow rate in each individual cytoplasmic
streaming process. 
\DIFdelbegin \DIFdel{Table \ref{tbl:flow} illustrates the dependencies
between rates of individual processes, their frequency of occurrence
and the total rate of these processes. }\DIFdelend %DIF >  A {\em Total cytoplasmic flow rate} is an attribute of {\em Regulation
%DIF >    of cytoplasmic streaming} processes, while {\em Cytoplasmic flow
%DIF >    rate} is an attribute of individual {\em cytoplasmic streaming}
%DIF >  processes. Depending both on whether the cytoplasmic flow rate in
%DIF >  individual {\em cytoplasmic streaming} processes is increased or
%DIF >  decreased and whether the frequency of occurrence of {\em cytoplasmic
%DIF >    streaming} is increased or decreased, the total cytoplasmic flow
%DIF >  rate can be increased or decreased.
We include total rates as \DIFaddbegin \DIFadd{subclasses of }\DIFaddend regulatory abnormalities in
the CPO since these are the attributes of processes that are often
measured or observed, while the rates of individual processes are
inferred\DIFdelbegin \DIFdel{following a schema such as Table
\ref{tbl:flow}}\DIFdelend .
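
The interplay between the two levels can be illustrated with a
deliberately simplified calculation. The symbols below are illustrative
notation, and the assumption that all individual processes are alike is
made only to keep the example short. Suppose that $N$ individual
cytoplasmic streaming processes occur within a reference process of
duration $T$, each with flow rate $r$ and duration $d$. The frequency of
occurrence is then $f = N/T$, and the total flow rate is
\[
  R_{\mathrm{total}} = \frac{1}{T} \sum_{i=1}^{N} r\,d = f\,r\,d .
\]
If, relative to the normal case, the frequency doubles ($f' = 2f$) while
the individual flow rate drops by a quarter ($r' = 0.75\,r$) and $d$ is
unchanged, then
$R'_{\mathrm{total}} = 2 \times 0.75 \times R_{\mathrm{total}} = 1.5\,R_{\mathrm{total}}$:
the total flow rate is increased although each individual process has a
decreased flow rate.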

\subsection{Implementation}
We were faced with two choices for implementing the CPO: we could
either implement a pre-composed ontology in which all classes and
their definitions are pre-generated according to the patterns we
define, or we could develop annotation software that enables the
selection of our process phenotype patterns based on the current
structure of the GO.  To maximize the utility and compatibility of the
CPO, and to provide stable identifiers for its concepts, we selected
the first strategy and developed software to automatically generate
a pre-composed ontology from the GO.

The software uses the OWL API \cite{Horridge2007}
to generate an OWL representation of the CPO. It
requires three input files: a version of the GO on which to base the
generated CPO, a version of PATO that is used to define abnormal
qualities, and a copy of the GO cross-product definitions
\cite{Mungall2010go} that is used to relate cell components to the
processes in which they participate as well as identify the
participants, inputs and outputs of processes.

We automatically generate a unique numerical identifier for each class
in the CPO.  Since the CPO is based on the GO and needs to be updated
with subsequent versions of the GO, we must keep identifiers
stable in subsequent versions of the CPO. Therefore, we use the
identifiers for GO classes to generate CPO class identifiers.

In the CPO, identifiers contain two components and are of the form
{\tt CPO:XXGGGGGGG}, where {\tt GGGGGGG} is the seven-digit identifier
of the GO class on which the CPO class is based, and {\tt XX} is a
prefix that identifies the type of phenotype pattern that is applied
to the GO class. For example, based on the class {\em Apoptosis} ({\tt
  GO:0006915}), we generate the CPO classes {\em Abnormality of
  Apoptosis}, {\em Abnormality of single occurrence of apoptosis} and
{\em Abnormality of regulation of apoptosis}.  We use the prefixes
{\tt 12}, {\tt 14} and {\tt 15} for each of the corresponding
phenotype patterns, and consequently generate the class identifiers
{\tt CPO:120006915}, {\tt CPO:140006915} and {\tt CPO:150006915}. As
long as the GO maintains its identifier for the {\em Apoptosis} class,
the identifiers in the CPO will remain stable even when it is
regenerated.
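
The identifier scheme amounts to simple arithmetic on the numerical
parts of the identifiers. As a sketch (the symbols $x$, $g$ and $n$ are
illustrative notation), let $x$ be the two-digit pattern prefix and $g$
the number formed by the seven digits of the GO identifier, with
$0 \le g < 10^{7}$. The numerical part $n$ of the CPO identifier is then
\[
  n = x \cdot 10^{7} + g ,
\]
and both components can be recovered from $n$ as
\[
  x = \lfloor n / 10^{7} \rfloor , \qquad g = n \bmod 10^{7} .
\]
For {\em Apoptosis} ($g = 6915$) and the prefix $x = 12$, this gives
$n = 12 \cdot 10^{7} + 6915 = 120006915$, i.e., {\tt CPO:120006915}.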

% \begin{table*}
%   \centering
%   \begin{tabular}{|p{7cm}|l|p{4cm}|}
%     Neumann et al.\cite{Neumann2010}&Schmitz et
%     al.\cite{Schmitz2010}&Fuchs et al.\cite{Fuchs2010}\\
%     \hline
%     binuclear & normal mitotic exit & actin fiver cells\\
%     cell death & prolonged mitotic exit & big cells \\
%     cell migration & & bright and large cells phenotype \\
%     condensation followed by decondensation without completion of
%     mitosis & & bright nuclei\\
%     condensation without mitosis/collapse of nucleus & & cells with protrusions\\
%     dynamic changes & & elongated cells\\
%     failure in decondensation & & elongated cells with protrusions\\
%     grape& & high actin ratio cells \\
%     increased proliferation& & lamellipodia + high actin ratio cells\\
%     large& & lamellipodia cells\\
%     large nucleus& & large cells\\
%     metaphase alignment problems/including no metaphase& & large nuclei\\
%     metaphase delay/arrest& & low eccentricity cells\\
%     migration (distance)& & metaphase cells\\
%     migration (speed)& & proliferating cells\\
%     mitotic delay/arrest& & small cells\\
%     nuclei stay close together& & small cells with an enrichment of mitotic cells\\
%     polylobed& &\\
%     pulsating nuclei& &\\
%     segregation problems/chromatin bridges/lagging chromosomes/multiple DNA masses&&\\
%     small nucleus&&\\
%     strange nuclear shape&&\\
%     \hline
%   \end{tabular}
%   \caption{\label{tbl:studies}The table summarizes cellular phenotype
%     terms used in three recent systems microscopy studies.}
% \end{table*}

We use the labels of GO classes to automatically generate class labels
for phenotype classes as well as textual definitions for classes in
the CPO. For example, the label of the class for increased number of
occurrences of {\em Apoptosis} is {\em Increased frequency of
  occurrences of Apoptosis}, and its textual definition states that an
increased frequency of occurrences of {\em Apoptosis} is a phenotype
of {\em Regulation of apoptosis} in which the number of occurrences of
{\em Apoptosis} within a given time period is increased in comparison
to a reference process that is considered {\em normal}.

As of November 2011, CPO contains 125,466 classes of which 79,236 are
explicitly defined.  The ELK reasoner \cite{Kazakov2011} is able to
perform a classification of the ontology in under 10 seconds. We make
the ontology and the source code that is used to generate it freely
available at \url{http://cell-phenotype.googlecode.com}.

\section{Discussion}
\subsection{Applications of the CPO}
The Fission Yeast Phenotype Ontology (FYPO), a new ontology developed
to support annotation of phenotypes in {\em Schizosaccharomyces
  pombe}, consists of pre-composed terms describing normal or abnormal
cellular phenotypes. Over 80\% of FYPO definitions reference
descendants of GO-BP's {\em Cellular process} as the entity; a further
11\% reference GO-CC terms. All FYPO explicit definitions reference
qualities in PATO, including {\em normal}, {\em abnormal}, and several
process qualities including {\em \DIFdelbegin \DIFdel{increased }\DIFdelend \DIFaddbegin \DIFadd{Increased }\DIFaddend duration} and {\em
  \DIFdelbegin \DIFdel{decreased }\DIFdelend \DIFaddbegin \DIFadd{Decreased }\DIFaddend occurrence}. FYPO will thus fit neatly under the CPO
umbrella, and stands to benefit from the automated synchronization
between CPO and GO, as well as the integration of cellular phenotypes
across species that the CPO can provide. {\em Schizosaccharomyces
  pombe} annotations to FYPO terms will provide a rich body of highly
specific, well-supported data to be integrated with data from other
species.

A further domain that will greatly benefit from the CPO is {\em
  systems microscopy}, which aims to understand complex and dynamic
cellular systems by combining automated fluorescence microscopy, cell
microarray platforms, quantitative image analysis and data mining
\cite{Lock2010}.  If we consider some of the studies that have been
published in this field in the last few years \DIFdelbegin %DIFDELCMD < \cite{Neumann2010,
%DIFDELCMD <   Schmitz2010, Fuchs2010}%%%
\DIFdelend \DIFaddbegin \cite{Neumann2010,
  Schmitz2010short, Fuchs2010}\DIFaddend , the need for CPO becomes evident.  In the
three studies, live-cell imaging assays and RNAi knockdown were used
to generate phenotypic profiles that quantify the cellular response to
a given siRNA, thus allowing the identification of hundreds of genes
involved in diverse biological functions including cell division,
migration and survival.  In each study, several phenotypes were
detected and described by the authors without the use of ontologies or
controlled vocabularies, making the integration between datasets
extremely difficult. For example, cell division
phenotypes were observed in all three datasets and referred to by
terms such as {\em Mitotic delay/arrest}, {\em Prolonged mitotic
  exit}, {\em Metaphase delay} and {\em Metaphase cells}. Without a
controlled vocabulary of cellular phenotypes, the overlap between such
phenotype descriptions is unclear.

Data integration is also complicated by the lack of standardization at
the level of data production and processing; all these issues are
currently being addressed by the different groups involved in the
Systems Microscopy Network of Excellence
(\url{http://www.systemsmicroscopy.eu/}), and the first step towards data
integration can be achieved by further developing the CPO.
\DIFdelbegin %DIFDELCMD < 

%DIFDELCMD < %%%
\DIFdelend This ontology will be used to integrate phenotype definitions across
existing datasets and will then become an integral part of the data
processing pipeline, where it will be used to annotate data as they are
generated \cite{Conrad2011}.

\subsection{Future research}
Our main contribution is an analysis of process phenotypes that are
used across multiple domains and scales and which are crucial for
understanding and representing physiology of living systems. The
Ontology of Physics for Biology (OPB) \cite{opb} is an ontology that
has recently been proposed to characterize physiological processes and
the physical qualities of biological entities based on a theory of
fluid dynamics. It is an important goal for future research to
incorporate the OPB in phenotypic descriptions and make our theory of
process phenotypes compatible with the physical descriptions of
processes and their attributes as outlined by the OPB.

We implemented the CPO using a pattern-based approach to formulating
phenotypes involving processes. The patterns we identify are based on
pre-existing ontologies, in particular the PATO ontology and the
classification of cellular processes as well as cellular components in
the GO. The result of our method is a large ontology in which classes
for phenotypes are {\em pre-composed}: they are named and defined
within an OWL ontology. However, the large size of the resulting
ontology may impair its utility for data annotation and integration,
and software tools may not always support ontologies of this
size. The alternative to pre-composing all possible phenotype
classes using the patterns we describe is to generate
appropriately defined classes dynamically, at the time they are
used. To achieve this goal, software must be developed to support
ontology users in applying these patterns and generating the appropriate
class descriptions when required.
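
The dynamic alternative described above can be illustrated with a
minimal Python sketch. It is not the software used to build the CPO:
the class names, the function \texttt{compose\_phenotype}, the
placeholder identifiers (\texttt{EX:...}) and the class expression
template are assumptions made for illustration only. The sketch
composes a phenotype class description from a process term and a
quality term on demand and caches the result, so that only the classes
that are actually requested are ever created.

\begin{verbatim}
```python
from dataclasses import dataclass
from functools import lru_cache


@dataclass(frozen=True)
class Term:
    """An ontology term: an identifier plus a readable label."""
    term_id: str
    label: str


@dataclass(frozen=True)
class PhenotypeClass:
    """A phenotype class description composed on demand."""
    label: str
    definition: str


# Hypothetical template, NOT the CPO's actual axiom pattern:
# a Manchester-style expression combining process and quality.
TEMPLATE = (
    "has_part some "
    "({process} and has_quality some {quality})"
)


@lru_cache(maxsize=None)
def compose_phenotype(
    process: Term, quality: Term
) -> PhenotypeClass:
    """Build (and cache) the class for one process/quality pair."""
    return PhenotypeClass(
        label=f"{quality.label} of {process.label}",
        definition=TEMPLATE.format(
            process=process.term_id,
            quality=quality.term_id,
        ),
    )


# Placeholder identifiers; real ones would come from GO and PATO.
cell_division = Term("EX:0000001", "cell division")
increased_duration = Term("EX:0000002", "increased duration")

phenotype = compose_phenotype(cell_division, increased_duration)
print(phenotype.label)
print(phenotype.definition)
```
\end{verbatim}

Because results are cached, repeated requests for the same pair of
process and quality return the same description, which mimics a
pre-composed class without materializing the full cross-product in
advance.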

A further important task is to develop the theory we outlined and
applied for the CPO. In particular, a precise formal characterization
of this theory in terms of axioms will further improve the clarity of
phenotypic descriptions of processes and enable its integration in
well-developed formal ontologies of processes \cite{Herre2006,
  Gruninger2010}.

\section{Acknowledgement}
\DIFaddbegin \DIFadd{We thank Michel Dumontier for valuable discussions about the formal
representation of the axioms and the different possible
interpretations of the }{\bf \DIFadd{regulates}} \DIFadd{relation.
}

\DIFaddend Funding for RH is provided by the European Commission's 7th Framework
Programme, RICORDO project, grant number 248502. MAH is supported by
the Wellcome Trust, grant WT090548MA. HH is supported by the Institute for
Medical Informatics, Statistics and Epidemiology, University of
Leipzig.  Funding for GR is provided by the European Union's Seventh
Framework Programme (FP7/2007-2013) under grant agreement number
258068, EU-FP7-Systems Microscopy NoE. 

\DIFdelbegin %DIFDELCMD < \bibliographystyle{natbib}
%DIFDELCMD < %%%
%DIF <  \begin{thebibliography}{}
%DIFDELCMD < 

%DIFDELCMD < %%%
%DIF <  \bibitem[Ashburner {\em et~al.}(2000)Ashburner, Ball, Blake, Botstein, Butler,
%DIF <    Cherry, Davis, Dolinski, Dwight, Eppig, Harris, Hill, Issel-Tarver,
%DIF <    Kasarskis, Lewis, Matese, Richardson, Ringwald, Rubin, and
%DIF <    Sherlock]{Ashburner2000b}
%DIF <  Ashburner, M., Ball, C.~A., Blake, J.~A., Botstein, D., Butler, H., Cherry,
%DIF <    J.~M., Davis, A.~P., Dolinski, K., Dwight, S.~S., Eppig, J.~T., Harris,
%DIF <    M.~A., Hill, D.~P., Issel-Tarver, L., Kasarskis, A., Lewis, S., Matese,
%DIF <    J.~C., Richardson, J.~E., Ringwald, M., Rubin, G.~M., and Sherlock, G.
%DIF <    (2000).
%DIF <  \newblock Gene ontology: tool for the unification of biology. the gene ontology
%DIF <    consortium.
%DIF <  \newblock {\em Nat Genet\/}, {\bf 25}(1), 25--29.
%DIFDELCMD < 

%DIFDELCMD < %%%
%DIF <  \bibitem[Bada and Hunter(2007)Bada and Hunter]{Bada2007a}
%DIF <  Bada, M. and Hunter, L. (2007).
%DIF <  \newblock Enrichment of obo ontologies.
%DIF <  \newblock {\em Journal of Biomedical Informatics\/}, {\bf 40}(3), 300--315.
%DIFDELCMD < 

%DIFDELCMD < %%%
%DIF <  \bibitem[Bada {\em et~al.}(2004)Bada, Stevens, Goble, Gil, Ashburner, Blake,
%DIF <    Cherry, Harris, and Lewis]{Bada2004}
%DIF <  Bada, M., Stevens, R., Goble, C., Gil, Y., Ashburner, M., Blake, J.~A., Cherry,
%DIF <    M.~J., Harris, M., and Lewis, S. (2004).
%DIF <  \newblock A short study on the success of the gene ontology.
%DIF <  \newblock {\em Web Semantics: Science, Services and Agents on the World Wide
%DIF <    Web\/}, {\bf 1}(2), 235--240.
%DIFDELCMD < 

%DIFDELCMD < %%%
%DIF <  \bibitem[Bard {\em et~al.}(2005)Bard, Rhee, and Ashburner]{Bard2005}
%DIF <  Bard, J., Rhee, S.~Y., and Ashburner, M. (2005).
%DIF <  \newblock An ontology for cell types.
%DIF <  \newblock {\em Genome Biology\/}, {\bf 6}(2).
%DIFDELCMD < 

%DIFDELCMD < %%%
%DIF <  \bibitem[Berners-Lee {\em et~al.}(2001)Berners-Lee, Hendler, Lassila, {\em
%DIF <    et~al.}]{Berners-Lee2001}
%DIF <  Berners-Lee, T., Hendler, J., Lassila, O., {\em et~al.} (2001).
%DIF <  \newblock {The Semantic Web}.
%DIF <  \newblock {\em Scientific American\/}, {\bf 284}(5), 28--37.
%DIFDELCMD < 

%DIFDELCMD < %%%
%DIF <  \bibitem[Conrad {\em et~al.}(2011)Conrad, W\"{u}nsche, Tan, Bulkescher,
%DIF <    Sieckmann, Verissimo, Edelstein, Walter, Liebel, Pepperkok, and
%DIF <    Ellenberg]{Conrad2011}
%DIF <  Conrad, C., W\"{u}nsche, A., Tan, T. H.~H., Bulkescher, J., Sieckmann, F.,
%DIF <    Verissimo, F., Edelstein, A., Walter, T., Liebel, U., Pepperkok, R., and
%DIF <    Ellenberg, J. (2011).
%DIF <  \newblock Micropilot: automation of fluorescence microscopy-based imaging for
%DIF <    systems biology.
%DIF <  \newblock {\em Nature methods\/}, {\bf 8}(3), 246--249.
%DIFDELCMD < 

%DIFDELCMD < %%%
%DIF <  \bibitem[Cook {\em et~al.}(2011)Cook, Bookstein, and Gennari]{opb}
%DIF <  Cook, D.~L., Bookstein, F.~L., and Gennari, J.~H. (2011).
%DIF <  \newblock Physical properties of biological entities: An introduction to the
%DIF <    ontology of physics for biology.
%DIF <  \newblock {\em PLoS ONE\/}, {\bf 6}(12), e28708.
%DIFDELCMD < 

%DIFDELCMD < %%%
%DIF <  \bibitem[Eilbeck {\em et~al.}(2005)Eilbeck, Lewis, Mungall, Yandell, Stein,
%DIF <    Durbin, and Ashburner]{Eilbeck2005}
%DIF <  Eilbeck, K., Lewis, S.~E., Mungall, C.~J., Yandell, M., Stein, L., Durbin, R.,
%DIF <    and Ashburner, M. (2005).
%DIF <  \newblock The sequence ontology: A tool for the unification of genome
%DIF <    annotations.
%DIF <  \newblock {\em Genome Biology\/}, {\bf 6}(R55).
%DIFDELCMD < 

%DIFDELCMD < %%%
%DIF <  \bibitem[Engel {\em et~al.}(2010)Engel, Balakrishnan, Binkley, Christie,
%DIF <    Costanzo, Dwight, Fisk, Hirschman, Hitz, Hong, Krieger, Livstone, Miyasato,
%DIF <    Nash, Oughtred, Park, Skrzypek, Weng, Wong, Dolinski, Botstein, and
%DIF <    Cherry]{ypo}
%DIF <  Engel, S.~R., Balakrishnan, R., Binkley, G., Christie, K.~R., Costanzo, M.~C.,
%DIF <    Dwight, S.~S., Fisk, D.~G., Hirschman, J.~E., Hitz, B.~C., Hong, E.~L.,
%DIF <    Krieger, C.~J., Livstone, M.~S., Miyasato, S.~R., Nash, R., Oughtred, R.,
%DIF <    Park, J., Skrzypek, M.~S., Weng, S., Wong, E.~D., Dolinski, K., Botstein, D.,
%DIF <    and Cherry, J.~M. (2010).
%DIF <  \newblock {Saccharomyces Genome Database provides mutant phenotype data}.
%DIF <  \newblock {\em Nucleic Acids Research\/}, {\bf 38}(suppl 1), D433--D436.
%DIFDELCMD < 

%DIFDELCMD < %%%
%DIF <  \bibitem[Fuchs {\em et~al.}(2010)Fuchs, Pau, Kranz, Sklyar, Budjan, Steinbrink,
%DIF <    Horn, Pedal, Huber, and Boutros]{Fuchs2010}
%DIF <  Fuchs, F., Pau, G., Kranz, D., Sklyar, O., Budjan, C., Steinbrink, S., Horn,
%DIF <    T., Pedal, A., Huber, W., and Boutros, M. (2010).
%DIF <  \newblock Clustering phenotype populations by genome-wide {RNAi} and
%DIF <    multiparametric imaging.
%DIF <  \newblock {\em Molecular Systems Biology\/}, {\bf 6}.
%DIFDELCMD < 

%DIFDELCMD < %%%
%DIF <  \bibitem[{Gene Ontology Consortium}(2010){Gene Ontology Consortium}]{go2010}
%DIF <  {Gene Ontology Consortium} (2010).
%DIF <  \newblock The gene ontology in 2010: extensions and refinements.
%DIF <  \newblock {\em Nucleic acids research\/}, {\bf 38}(Database issue), D331--335.
%DIFDELCMD < 

%DIFDELCMD < %%%
%DIF <  \bibitem[Gkoutos and Hoehndorf(2011)Gkoutos and Hoehndorf]{obml2011h1}
%DIF <  Gkoutos, G.~V. and Hoehndorf, R. (2011).
%DIF <  \newblock Ontology-based cross-species integration and analysis of
%DIF <    saccharomyces cerevisiae phenotypes.
%DIF <  \newblock In {\em Proceedings of the 3rd Workshop for Ontologies in Biomedicine
%DIF <    and Life sciences (OBML)\/}.
%DIFDELCMD < 

%DIFDELCMD < %%%
%DIF <  \bibitem[Gkoutos {\em et~al.}(2005)Gkoutos, Green, Mallon, Hancock, and
%DIF <    Davidson]{Gkoutos2005}
%DIF <  Gkoutos, G.~V., Green, E.~C., Mallon, A.-M.~M., Hancock, J.~M., and Davidson,
%DIF <    D. (2005).
%DIF <  \newblock {Using ontologies to describe mouse phenotypes.}
%DIF <  \newblock {\em Genome biology\/}, {\bf 6}(1).
%DIFDELCMD < 

%DIFDELCMD < %%%
%DIF <  \bibitem[Gkoutos {\em et~al.}(2009)Gkoutos, Mungall, Dolken, Ashburner, Lewis,
%DIF <    Hancock, Schofield, Kohler, and Robinson]{Gkoutos2009b}
%DIF <  Gkoutos, G.~V., Mungall, C., Dolken, S., Ashburner, M., Lewis, S., Hancock, J.,
%DIF <    Schofield, P., Kohler, S., and Robinson, P.~N. (2009).
%DIF <  \newblock Entity/quality-based logical definitions for the human skeletal
%DIF <    phenome using {PATO}.
%DIF <  \newblock {\em Annual International Conference of the IEEE Engineering in
%DIF <    Medicine and Biology Society.}, {\bf 1}, 7069--7072.
%DIFDELCMD < 

%DIFDELCMD < %%%
%DIF <  \bibitem[Goble and Stevens(2008)Goble and Stevens]{goble}
%DIF <  Goble, C. and Stevens, R. (2008).
%DIF <  \newblock State of the nation in data integration for bioinformatics.
%DIF <  \newblock {\em Journal of Biomedical Informatics\/}, {\bf 41}(5), 687--693.
%DIFDELCMD < 

%DIFDELCMD < %%%
%DIF <  \bibitem[Grau {\em et~al.}(2008)Grau, Horrocks, Motik, Parsia, Patelschneider,
%DIF <    and Sattler]{Grau2008}
%DIF <  Grau, B., Horrocks, I., Motik, B., Parsia, B., Patelschneider, P., and Sattler,
%DIF <    U. (2008).
%DIF <  \newblock {OWL} 2: The next step for {OWL}.
%DIF <  \newblock {\em Web Semantics: Science, Services and Agents on the World Wide
%DIF <    Web\/}, {\bf 6}(4), 309--322.
%DIFDELCMD < 

%DIFDELCMD < %%%
%DIF <  \bibitem[Gruber(1995)Gruber]{Gruber1995}
%DIF <  Gruber, T.~R. (1995).
%DIF <  \newblock Toward principles for the design of ontologies used for knowledge
%DIF <    sharing.
%DIF <  \newblock {\em International Journal of Human-Computer Studies\/}, {\bf
%DIF <    43}(5-6).
%DIFDELCMD < 

%DIFDELCMD < %%%
%DIF <  \bibitem[Guarino(1998)Guarino]{Guarino1998}
%DIF <  Guarino, N. (1998).
%DIF <  \newblock Formal ontology and information systems.
%DIF <  \newblock In N.~Guarino, editor, {\em Proceedings of the 1st International
%DIF <    Conference on Formal Ontologies in Information Systems\/}, pages 3--15. IOS
%DIF <    Press.
%DIFDELCMD < 

%DIFDELCMD < %%%
%DIF <  \bibitem[Herre {\em et~al.}(2006)Herre, Heller, Burek, Hoehndorf, Loebe, and
%DIF <    Michalek]{Herre2006}
%DIF <  Herre, H., Heller, B., Burek, P., Hoehndorf, R., Loebe, F., and Michalek, H.
%DIF <    (2006).
%DIF <  \newblock {G}eneral {F}ormal {O}ntology ({GFO}) -- {A} foundational ontology
%DIF <    integrating objects and processes [{V}ersion 1.0].
%DIF <  \newblock Onto-Med Report~8, IMISE, University of Leipzig, Leipzig, Germany.
%DIFDELCMD < 

%DIFDELCMD < %%%
%DIF <  \bibitem[Hoehndorf {\em et~al.}(2010a)Hoehndorf, Oellrich, and
%DIF <    Rebholz-Schuhmann]{Hoehndorf2010phene}
%DIF <  Hoehndorf, R., Oellrich, A., and Rebholz-Schuhmann, D. (2010a).
%DIF <  \newblock Interoperability between phenotype and anatomy ontologies.
%DIF <  \newblock {\em Bioinformatics\/}, {\bf 26}(24), 3112 -- 3118.
%DIFDELCMD < 

%DIFDELCMD < %%%
%DIF <  \bibitem[Hoehndorf {\em et~al.}(2010b)Hoehndorf, Oellrich, Dumontier, Kelso,
%DIF <    Rebholz-Schuhmann, and Herre]{Hoehndorf2010patterns}
%DIF <  Hoehndorf, R., Oellrich, A., Dumontier, M., Kelso, J., Rebholz-Schuhmann, D.,
%DIF <    and Herre, H. (2010b).
%DIF <  \newblock Relations as patterns: Bridging the gap between {OBO} and {OWL}.
%DIF <  \newblock {\em BMC Bioinformatics\/}, {\bf 11}(1), 441+.
%DIFDELCMD < 

%DIFDELCMD < %%%
%DIF <  \bibitem[Horridge {\em et~al.}(2007)Horridge, Bechhofer, and
%DIF <    Noppens]{Horridge2007}
%DIF <  Horridge, M., Bechhofer, S., and Noppens, O. (2007).
%DIF <  \newblock Igniting the {OWL} 1.1 touch paper: The {OWL} {API}.
%DIF <  \newblock In {\em Proceedings of OWLED 2007: Third International Workshop on
%DIF <    OWL Experiences and Directions\/}.
%DIFDELCMD < 

%DIFDELCMD < %%%
%DIF <  \bibitem[Horrocks(2007)Horrocks]{Horrocks2007}
%DIF <  Horrocks, I. (2007).
%DIF <  \newblock {OBO} flat file format syntax and semantics and mapping to {OWL}
%DIF <    {W}eb {O}ntology {L}anguage.
%DIF <  \newblock Technical report, University of Manchester.
%DIF <  \newblock \url{http://www.cs.man.ac.uk/~horrocks/obo/}.
%DIFDELCMD < 

%DIFDELCMD < %%%
%DIF <  \bibitem[Kazakov {\em et~al.}(2011)Kazakov, Kr{\"o}tzsch, and
%DIF <    Siman\v{c}\'{i}k]{Kazakov2011}
%DIF <  Kazakov, Y., Kr{\"o}tzsch, M., and Siman\v{c}\'{i}k, F. (2011).
%DIF <  \newblock Unchain my $\mathcal{EL}$ reasoner.
%DIF <  \newblock In {\em Proceedings of the 23rd International Workshop on Description
%DIF <    Logics (DL'10)\/}, CEUR Workshop Proceedings. CEUR-WS.org.
%DIFDELCMD < 

%DIFDELCMD < %%%
%DIF <  \bibitem[Lock and Strömblad(2010)Lock and Strömblad]{Lock2010}
%DIF <  Lock, J.~G. and Strömblad, S. (2010).
%DIF <  \newblock Systems microscopy: an emerging strategy for the life sciences.
%DIF <  \newblock {\em Experimental Cell Research\/}, {\bf 316}(8), 1438--1444.
%DIFDELCMD < 

%DIFDELCMD < %%%
%DIF <  \bibitem[Mungall {\em et~al.}(2010a)Mungall, Gkoutos, Smith, Haendel, Lewis,
%DIF <    and Ashburner]{Mungall2010}
%DIF <  Mungall, C., Gkoutos, G., Smith, C., Haendel, M., Lewis, S., and Ashburner, M.
%DIF <    (2010a).
%DIF <  \newblock Integrating phenotype ontologies across multiple species.
%DIF <  \newblock {\em Genome Biology\/}, {\bf 11}(1), R2+.
%DIFDELCMD < 

%DIFDELCMD < %%%
%DIF <  \bibitem[Mungall {\em et~al.}(2010b)Mungall, Bada, Berardini, Deegan, Ireland,
%DIF <    Harris, Hill, and Lomax]{Mungall2010go}
%DIF <  Mungall, C.~J., Bada, M., Berardini, T.~Z., Deegan, J., Ireland, A., Harris,
%DIF <    M.~A., Hill, D.~P., and Lomax, J. (2010b).
%DIF <  \newblock Cross-product extensions of the gene ontology.
%DIF <  \newblock {\em Journal of biomedical informatics\/}.
%DIF <  \newblock in press.
%DIFDELCMD < 

%DIFDELCMD < %%%
%DIF <  \bibitem[Neumann {\em et~al.}(2010)Neumann, Walter, Hériché, Bulkescher,
%DIF <    Erfle, Conrad, Rogers, Poser, Held, Liebel, and et~al.]{Neumann2010}
%DIF <  Neumann, B., Walter, T., Hériché, J.-K., Bulkescher, J., Erfle, H., Conrad,
%DIF <    C., Rogers, P., Poser, I., Held, M., Liebel, U., and et~al. (2010).
%DIF <  \newblock Phenotypic profiling of the human genome by time-lapse microscopy
%DIF <    reveals cell division genes.
%DIF <  \newblock {\em Nature\/}, {\bf 464}(7289), 721--727.
%DIFDELCMD < 

%DIFDELCMD < %%%
%DIF <  \bibitem[Ogren {\em et~al.}(2004)Ogren, Cohen, Acquaah-Mensah, Eberlein, and
%DIF <    Hunter]{Ogren2004}
%DIF <  Ogren, P.~V., Cohen, K.~B., Acquaah-Mensah, G.~K., Eberlein, J., and Hunter, L.
%DIF <    (2004).
%DIF <  \newblock The compositional structure of gene ontology terms.
%DIF <  \newblock {\em Pac Symp Biocomput\/}, pages 214--225.
%DIFDELCMD < 

%DIFDELCMD < %%%
%DIF <  \bibitem[\"{O}zg\"{o}vde and Gr\"{u}ninger(2010)\"{O}zg\"{o}vde and
%DIF <    Gr\"{u}ninger]{Gruninger2010}
%DIF <  \"{O}zg\"{o}vde, A. and Gr\"{u}ninger, M. (2010).
%DIF <  \newblock Foundational process relations in bio-ontologies.
%DIF <  \newblock In {\em Proceeding of the 2010 conference on Formal Ontology in
%DIF <    Information Systems\/}, pages 243--256, Amsterdam, The Netherlands, The
%DIF <    Netherlands. IOS Press.
%DIFDELCMD < 

%DIFDELCMD < %%%
%DIF <  \bibitem[Robinson {\em et~al.}(2008)Robinson, Koehler, Bauer, Seelow, Horn, and
%DIF <    Mundlos]{Robinson2008}
%DIF <  Robinson, P.~N., Koehler, S., Bauer, S., Seelow, D., Horn, D., and Mundlos, S.
%DIF <    (2008).
%DIF <  \newblock The human phenotype ontology: a tool for annotating and analyzing
%DIF <    human hereditary disease.
%DIF <  \newblock {\em American journal of human genetics\/}, {\bf 83}(5), 610--615.
%DIFDELCMD < 

%DIFDELCMD < %%%
%DIF <  \bibitem[Schindelman {\em et~al.}(2011)Schindelman, Fernandes, Bastiani, Yook,
%DIF <    and Sternberg]{wpo}
%DIF <  Schindelman, G., Fernandes, J., Bastiani, C., Yook, K., and Sternberg, P.
%DIF <    (2011).
%DIF <  \newblock Worm phenotype ontology: integrating phenotype data within and beyond
%DIF <    the c. elegans community.
%DIF <  \newblock {\em BMC Bioinformatics\/}, {\bf 12}(1), 32.
%DIFDELCMD < 

%DIFDELCMD < %%%
%DIF <  \bibitem[Schmitz {\em et~al.}(2010)Schmitz, Held, Janssens, Hutchins, Hudecz,
%DIF <    Ivanova, Goris, Trinkle-Mulcahy, Lamond, Poser, Hyman, Mechtler, Peters, and
%DIF <    Gerlich]{Schmitz2010}
%DIF <  Schmitz, M. H.~A., Held, M., Janssens, V., Hutchins, J. R.~A., Hudecz, O.,
%DIF <    Ivanova, E., Goris, J., Trinkle-Mulcahy, L., Lamond, A.~I., Poser, I., Hyman,
%DIF <    A.~A., Mechtler, K., Peters, J.-M., and Gerlich, D.~W. (2010).
%DIF <  \newblock Live-cell imaging rnai screen identifies pp2a–b55$\alpha$ and
%DIF <    importin-$\beta$1 as key mitotic exit regulators in human cells.
%DIF <  \newblock {\em Nature Cell Biology\/}, {\bf 12}, 886--893.
%DIFDELCMD < 

%DIFDELCMD < %%%
%DIF <  \bibitem[Schofield {\em et~al.}(2011)Schofield, Sundberg, Hoehndorf, and
%DIF <    Gkoutos]{Schofield2011}
%DIF <  Schofield, P.~N., Sundberg, J.~P., Hoehndorf, R., and Gkoutos, G.~V. (2011).
%DIF <  \newblock New approaches to the representation and analysis of phenotype
%DIF <    knowledge in human diseases and their animal models.
%DIF <  \newblock {\em Briefings in Functional Genomics\/}, {\bf 10}(5), 258--265.
%DIFDELCMD < 

%DIFDELCMD < %%%
%DIF <  \bibitem[Smith {\em et~al.}(2007)Smith, Ashburner, Rosse, Bard, Bug, Ceusters,
%DIF <    Goldberg, Eilbeck, Ireland, Mungall, Leontis, Serra, Ruttenberg, Sansone,
%DIF <    Scheuermann, Shah, Whetzel, and Lewis]{Smith2007}
%DIF <  Smith, B., Ashburner, M., Rosse, C., Bard, J., Bug, W., Ceusters, W., Goldberg,
%DIF <    L.~J., Eilbeck, K., Ireland, A., Mungall, C.~J., Leontis, N., Serra, P.~R.,
%DIF <    Ruttenberg, A., Sansone, S.~A., Scheuermann, R.~H., Shah, N., Whetzel, P.~L.,
%DIF <    and Lewis, S. (2007).
%DIF <  \newblock The {OBO} {F}oundry: coordinated evolution of ontologies to support
%DIF <    biomedical data integration.
%DIF <  \newblock {\em Nat Biotech\/}, {\bf 25}(11), 1251--1255.
%DIFDELCMD < 

%DIFDELCMD < %%%
%DIF <  \bibitem[Smith {\em et~al.}(2004)Smith, Goldsmith, and Eppig]{Smith2004}
%DIF <  Smith, C.~L., Goldsmith, C.-A.~W., and Eppig, J.~T. (2004).
%DIF <  \newblock The mammalian phenotype ontology as a tool for annotating, analyzing
%DIF <    and comparing phenotypic information.
%DIF <  \newblock {\em Genome Biology\/}, {\bf 6}(1), R7.
%DIFDELCMD < 

%DIFDELCMD < %%%
%DIF <  \end{thebibliography}
\DIFdelend \DIFaddbegin \bibliographystyle{plain}
\DIFaddend 

%DIF < \bibliography{/home/leechuck/Documents/papers/bibtex/lc}
\DIFaddbegin \bibliography{/home/leechuck/Documents/papers/bibtex/lc}
\DIFaddend 

\end{document}
